Abstract

MPEG video traffic is expected to cause several problems in ATM networks, both from a
performance and from an architectural viewpoint. To solve these difficulties,
appropriate  video  traffic  models  are  needed.  A  detailed  statistical  analysis  of  newly 
generated  long  MPEG  encoded  video  sequences  is  presented  and  the  results  are  compared 
to  those  of  existing  data  sets.  Based  on  the  results  of  the  analysis,  a  layered  modeling 
scheme for MPEG video traffic is suggested which will simplify the finding of appropriate
models for many performance analysis techniques.
1 Introduction


In  B-ISDNs,  a  major  part  of  the  traffic  will  be  produced  by  multimedia  sources  like 
teleconferencing  terminals  and  video-on-demand  servers.  These  networks  will  work  on 
the  basis  of  ATM  and  most  of  the  video  encoding  will  be  done  using  the  MPEG  standard 
(ISO  Moving  Picture  Expert  Group). 

There are a number of open issues concerning the transmission of MPEG video on high-speed
networks, such as finding the appropriate ATM adaptation layer, dimensioning the
multiplexer buffers, shaping video traffic, and monitoring video cell streams. To
solve these problems, several performance analyses have to be done, and therefore models
for MPEG video traffic streams have to be developed. The first step of the model development
is a thorough analysis of the statistical data sets of already encoded videos.

At the Institute of Computer Science at Würzburg, we encoded a variety of video sequences
and  carried  out  a  thorough  statistical  analysis  to  get  a  detailed  picture  of  the  video  data 
stream:  moments,  histograms,  QQ-plots,  autocorrelation  functions  of  frame  and  GOP 
sizes,  R/S-plots.  Based  on  this  information  and  the  knowledge  about  the  MPEG  coding 
technique,  we  propose  a  layered  video  modeling  scheme.  The  model  can  consist  of  GOP, 
frame, and cell layer, depending on the requirements of the analysis. For each layer
certain  stochastic  processes  are  suggested,  which  may  be  used  for  modeling. 

In  Section  2,  we  outline  the  MPEG  video  encoding  technique.  Section  3.1  is  about  the 
statistical  analysis  of  the  encoded  sequences  and  in  Section  3.2  the  layered  modeling 
scheme  is  presented.  Section  4  concludes  the  paper. 


2  MPEG  video  encoding 

Due  to  the  high  bandwidth  needs  of  uncompressed  video  data  streams,  several  coding 
algorithms  for  the  compression  of  these  streams  were  developed. 

At the moment, the MPEG coding scheme is widely used for all types of video applications.
There are two schemes, MPEG-I [7, 6] and MPEG-II [2], where the MPEG-I
functionalities  are  a  subset  of  the  MPEG-II  ones.  The  main  difference  with  respect  to 
video  transmission  on  ATM  is  that  MPEG-II  allows  for  layered  coding.  This  means  the 
video  data  stream  consists  of  a  base  layer  stream,  which  contains  the  most  important 
video  data,  and  of  one  or  more  enhancement  layers,  which  can  be  used  to  improve  the 
quality  of  the  video  sequence. 

In  this  paper,  we  focus  on  one-layer  video  data  streams  of  MPEG-I  type.  Most  of  the 
encoders  will  use  this  scheme  and  in  case  of  multi-layer  encoding  the  statistical  properties 
of  the  base  layer  will  be  almost  identical  to  this  type  of  stream. 

The  MPEG  encoder  input  sequence  consists  of  a  series  of  frames,  each  containing  a  two- 
dimensional  array  of  picture  elements,  called  pels.  The  number  of  frames  per  second  as 
well  as  the  number  of  lines  per  frame  and  pels  per  line  depend  on  national  standards. 
For each pel, both luminance and chrominance information is stored. The compression
algorithm  is  used  to  reduce  the  data  rate  before  transmitting  the  video  stream  over 
communication  networks. 


Figure  1:  Group  of  Pictures  of  an  MPEG  stream 

This is done by reducing both the spatial and the temporal redundancy of the video data
stream.  The  spatial  redundancies  are  reduced  by  transforms  and  entropy  coding  and 
the  temporal  redundancies  are  reduced  by  prediction  of  future  frames  based  on  motion 
vectors.  This  is  achieved  using  three  types  of  frames  (cf.  Figure  1): 

I-frames use only intra-frame coding, based on the discrete cosine transform and entropy
coding;

P-frames use a similar coding algorithm to I-frames, but with the addition of motion
compensation with respect to the previous I- or P-frame;

B-frames are similar to P-frames, except that the motion compensation can be with
respect to the previous I- or P-frame, the next I- or P-frame, or an interpolation
between them.

Typically, I-frames require more bits than P-frames. B-frames have the lowest bandwidth
requirement.

After coding, the frames are arranged in a deterministic periodic sequence, e.g. “IBBPBB”
or “IBBPBBPBBPBB”, which is called Group of Pictures (GOP).
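The periodic arrangement of frame types can be illustrated with a short Python sketch. The function name frame_types is hypothetical and serves only this illustration; the GOP pattern is one of the two quoted above.

```python
from itertools import cycle, islice


def frame_types(gop_pattern, n_frames):
    """Label the first n_frames frames by repeating the GOP pattern."""
    return "".join(islice(cycle(gop_pattern), n_frames))


# Frame types of the first 15 frames for a 12-frame GOP.
print(frame_types("IBBPBBPBBPBB", 15))
```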


3  Modeling  of  MPEG  video  traffic 

There  are  several  reasons  to  develop  models  for  video  traffic  and  to  use  them  for  the 
performance  analysis  of  ATM  networks. 

The first reason is to extract the statistical properties of video traffic which have a
remarkable impact on the network performance. We gain a lot of insight if we are able to
reduce the statistical complexity of the empirical video data sets. It is true that only the
frame size trace from the output of an MPEG encoder contains all statistical information
about the encoded video, but on the other hand the large number of properties makes it
difficult to decide which one is causing performance problems.

Movies (buy cassettes)
  dino          Jurassic Park
  lambs         The Silence of the Lambs

TV sports events (recorded from cable TV)
  soccer        Soccer World Cup 1994 Final: Brazil - Italy
  race          Formula 1 car race at Hockenheim/Germany 1994
  atp           ATP Tennis Final 1994: Becker - Sampras

Other TV sequences (recorded from cable TV)
  terminator    Terminator 2
  talk1         German talk show
  talk2         Political discussion
  simpsons      Cartoon
  asterix       Cartoon
  mr.bean       Three slapstick episodes
  news          German news show
  mtv           Music clips

Set top camera
  settop        Student sitting in front of workstation

Table 1: Overview of encoded sequences

The second reason is the computational complexity of simulations, particularly on cell
level, of ATM networks. It often takes long simulation runs to obtain results of high
accuracy.  In  some  cases  the  numerical  complexity  can  be  considerably  reduced  using 
traffic  models  and  standard  analytical  tools  like  matrix  analysis  or  discrete  time  analysis. 

The  third  reason  is  the  need  for  connection  traffic  descriptors  for  video  traffic.  If  the 
traffic  model  is  simple,  i.e.  it  has  only  a  small  number  of  parameters,  these  parameters 
can  be  used  as  traffic  descriptors  for  CAC  and  UPC  of  video  connections. 

For the development of video traffic models we can use both the knowledge about the
coding technique, MPEG-I or MPEG-II in our case, and the statistical analysis of the
frame size sequence which we obtain from measurements.


3.1  Statistical  analysis  of  MPEG  video  sequences 

In the following, we will present some statistical measurements from several movies, TV
sport events, and TV shows^1, which we encoded at our institute using the UC Berkeley
MPEG-I software encoder [5]. Table 1 shows the sequences which we used to produce
the data sets.

All  sequences  mentioned  below  were  encoded  using  the  following  parameter  set: 

^1 To avoid any conflict with copyright laws, we want to point out that all image processing, encoding,
and analysis work was done for scientific purposes. The encoded sequences have no audio stream and
will not be made publicly available. Only statistical data sets will be made available to colleagues.

                       Frames                     GOPs                       Bit rate
Sequence     Compr.    Mean     CoV    Peak/      Mean      CoV    Peak/     Mean     Peak
             rate      [bits]          Mean       [bits]           Mean      [Mbps]   [Mbps]
             X : 1
asterix      119       22,348   0.90    6.6       268,282   0.47   4.0       0.59     1.85
atp          121       21,890   0.93    8.7       262,648   0.37   3.0       0.55     1.58
dino         203       13,078   1.13    9.1       156,928   0.40   4.0       0.33     1.01
lambs        363        7,312   1.53   18.4        87,634   0.60   5.3       0.18     0.85
mr.bean      150       17,647   1.17   13.0       211,368   0.50   4.1       0.44     1.76
mtv          134       19,780   1.08   12.7       237,378   0.70   6.1       0.49     2.71
news         173       15,358   1.27   12.4       184,299   0.47   6.0       0.38     2.23
race          86       30,749   0.69    6.6       369,060   0.38   3.6       0.77     3.24
settop       305        6,031   1.92    7.7        72,379   0.18   2.0       0.15     0.27
simpsons     143       18,576   1.11   12.9       222,841   0.43   3.8       0.46     1.49
soccer       106       25,110   0.85    7.6       301,201   0.48   3.9       0.63     2.29
starwars     130       15,599   1.16   11.9       187,185   0.39   5.0       0.36     4.24
talk1        183       14,537   1.14    7.3       174,278   0.32   2.7       0.36     1.00
talk2        148       17,914   1.02    7.4       214,955   0.27   3.1       0.49     1.40
terminator   243       10,904   0.93    7.3       130,865   0.35   3.1       0.27     0.74

Table 2: Simple statistics of the encoded sequences

•  Each  frame  consists  of  one  slice; 

•  GOP  pattern:  IBBPBBPBBPBB  (12  frames); 

•  Quantizer scales: 10 (I), 14 (P), 18 (B);

•  Motion  vector  search:  logarithmic/simple;  window:  half  pel,  10;  reference  frame: 
original; 

•  Encoder  input:  384  x  288  pels  with  12  bit  color  information; 

•  Number  of  frames  per  sequence:  40000  (about  half  an  hour  of  video) 

Some  parameters  might  not  be  optimal  with  respect  to  the  quality  of  the  MPEG  video 
sequence,  because  of  some  hardware  limitations.  We  used  a  Sun  Sparc  20  for  the  image 
processing  and  encoding,  and  captured  the  sequence  from  a  VCR  with  a  SunVideo  SBus 
board. 


3.1.1  Overview 

Table  2  shows  the  compression  rates  and  the  most  important  moments  of  the  frame  sizes, 
the  GOP  sizes,  and  the  corresponding  bit  rates  of  the  MPEG  sequences. 

For the sake of comparison, the statistical data from Mark Garrett’s Star Wars sequence
[4] are also presented.
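The quantities reported in Table 2 can be derived from a frame size trace roughly as sketched below. This is a minimal illustration and not the tooling used for the measurements. The function names are hypothetical, the use of the population standard deviation for the CoV is an assumption, and so is the frame rate of 25 frames per second (the text only states that 40000 frames correspond to about half an hour of video).

```python
from statistics import fmean, pstdev


def simple_stats(sizes):
    """Return (mean, coefficient of variation, peak-to-mean ratio)."""
    mean = fmean(sizes)
    cov = pstdev(sizes) / mean  # assumption: population standard deviation
    return mean, cov, max(sizes) / mean


def gop_sizes(frame_sizes, gop_length=12):
    """Sum the frame sizes of each complete GOP."""
    n_gops = len(frame_sizes) // gop_length
    return [sum(frame_sizes[g * gop_length:(g + 1) * gop_length])
            for g in range(n_gops)]


def mean_bit_rate_mbps(frame_sizes, frames_per_second=25.0):
    """Mean bit rate in Mbps; the frame rate is an assumed parameter."""
    return fmean(frame_sizes) * frames_per_second / 1e6


# Toy trace in bits (made-up numbers, two GOPs of six frames each).
trace = [100, 20, 20, 50, 20, 20, 120, 30, 30, 60, 30, 30]
print(simple_stats(trace))
print(simple_stats(gop_sizes(trace, gop_length=6)))
```

Under the assumed frame rate, a mean frame size of 13,078 bits corresponds to roughly 0.33 Mbps, which is consistent with the dino row of Table 2.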
From Table 2 we conclude that typical TV sequences like sports, news, and music
clips lead to MPEG sequences with a high peak bit rate and a high peak-to-mean ratio
compared to movie sequences. These properties result from the rapid movements of a lot
of small objects, which increase the amount of data necessary to encode the sequence.

Unfortunately, even the statistical properties of sequences of the same category, like
movies or cartoons, are not in good agreement. For example, the measurements of terminator
and lambs or of simpsons and asterix have no moments lying close together. This
will lead to difficulties in finding traffic classes for MPEG video which can be used for
CAC and UPC.

In  the  remainder  of  this  section,  we  will  present  a  detailed  analysis  of  the  statistical  data 
of the dino, soccer, and starwars sequences.


3.1.2  Frame  traces 

Figures  3,  4,  and  5  show  the  frame  size  traces  of  the  dino,  soccer,  and  starwars  sequences. 
The  I  frame  sizes  are  light  gray,  the  P  frame  sizes  black,  and  the  B  frame  sizes  dark  gray. 
The  appearance  of  the  three  traces  is  very  different.  The  dino  trace  is  rather  smooth, 
whereas  the  other  two  traces  show  a  large  number  of  rapid  changes  in  the  frame  sizes  of 
each  type  of  frames.  But  although  both  traces  have  this  property,  they  look  different. 
The  P  frames  of  the  starwars  trace  are  large  compared  to  the  I  frames.  The  soccer  trace, 
however,  shows  very  large  changes  in  any  type  of  frames,  and  the  B  frames  are  often  of 
the  same  size  as  the  P  frames.  This  indicates  a  lot  of  movement  in  the  input  sequence  of 
the  encoder,  since  the  B  frames  only  become  large,  if  the  predicted  image  will  be  poor 
because  of  the  amount  of  movement  and  additional  data  has  to  be  encoded  to  correct 
these  prediction  errors.  This  will  be  the  case  for  soccer  matches  and  for  a  lot  of  other 
sports  events. 


3.1.3  Distributions 

Figures 6, 7, and 8 show the frame size histograms of the I, P, and B frames of the
dino sequence. The dashed curve is a Gamma pdf which has the same mean and variance
as the histogram frame sizes. The good agreement of the histogram and the Gamma curve
for the I and P frames becomes more obvious if we use a QQ-plot (quantile-quantile plot),
where the Gamma quantiles are plotted against the histogram quantiles. An agreement
with the dotted line indicates that both pdf’s are equal. The solid line is for the Gamma
pdf and the dashed line is for the Lognormal pdf, which has the same parameters as the
Gamma pdf. For the I frame sizes (Figure 9) both the Gamma and the Lognormal pdf
are good to very good approximations of the histogram pdf. In case of the P frames
(Figure 10) the Gamma pdf is in good agreement, whereas for the B frames (Figure 11)
the Lognormal pdf shows better performance.

For almost all encoded sequences, either the Gamma or the Lognormal pdf is a useful
approximation of the frame size histogram pdf’s of each type of frame. The differences
between Gamma and Lognormal approximation performance are not too large in most
cases. Perfect agreement of histogram and approximation cannot be achieved due to
finite frame sizes.
This leads to the conclusion that for the modeling of the frame sizes, either histograms,
Gamma, or Lognormal pdf’s can be used.

If we look at the GOP size distributions, we obtain similar results. Figures 12, 13, and 14
show the QQ-plots for the dino, soccer, and starwars sequences. Again, the Gamma and
Lognormal quantiles are plotted against the histogram quantiles. For the sequences
considered, the Lognormal distribution is a good approximation of the GOP size histogram,
but the Gamma distribution will also be adequate.
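Matching a Gamma or Lognormal pdf to the mean and variance of a histogram can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the plotting positions (i + 0.5)/n for the empirical quantiles are one common choice and not necessarily the one used for the figures, and only the Lognormal quantiles are computed because the Python standard library offers no Gamma quantile function.

```python
import math
from statistics import NormalDist, fmean, pvariance


def gamma_params(mean, var):
    """Shape and scale of a Gamma pdf with the given mean and variance."""
    return mean * mean / var, var / mean


def lognormal_params(mean, var):
    """mu and sigma of a Lognormal pdf with the given mean and variance."""
    sigma2 = math.log(1.0 + var / (mean * mean))
    return math.log(mean) - sigma2 / 2.0, math.sqrt(sigma2)


def lognormal_quantile(p, mu, sigma):
    """Quantile of the Lognormal distribution at probability p."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))


def qq_points(sizes, mu, sigma):
    """Pairs (empirical quantile, Lognormal quantile) for a QQ-plot."""
    data = sorted(sizes)
    n = len(data)
    return [(x, lognormal_quantile((i + 0.5) / n, mu, sigma))
            for i, x in enumerate(data)]


# Toy frame sizes in bits (made-up numbers).
sizes = [9000, 11000, 12500, 13000, 14000, 16000, 21000]
mean, var = fmean(sizes), pvariance(sizes)
print(gamma_params(mean, var))
print(qq_points(sizes, *lognormal_params(mean, var)))
```

Points close to the diagonal of such a plot indicate that the fitted distribution reproduces the empirical quantiles well.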


3.1.4  Correlations 

Time-dependent  statistics  are  important  in  the  case  of  video  traffic,  because  correlations 
of  the  data  streams  may  cause  performance  problems  of  the  ATM  network. 

First,  autocorrelation  functions  of  the  frame  sizes  and  of  the  GOP  sizes  are  presented.  The 
frame-by-frame correlations depend on the pattern of the GOP and, in principle,
always look like Figure 15 if the same GOP pattern is used for the whole sequence. The
larger positive peaks stem from the I frames, the smaller positive ones from the P frames,
and the negative ones from the B frames. This shape reflects the relationship of the
mean frame sizes of the frame types. A large I frame is followed by two small B frames.
Then a midsize P frame is produced by the encoder, which is followed by two small B
frames again. The pattern between two I frame peaks is repeated with slowly decaying
amplitude  of  the  peaks. 

If  a  model  is  needed  which  reflects  the  frame-by-frame  correlations  of  an  MPEG  video 
traffic  stream,  the  GOP-pattern  based  shape  of  the  autocorrelation  function  has  to  be 
considered. An approximation of the autocorrelation function is presented in [3].

Based  on  the  frame  level  correlations,  it  is  difficult  to  get  a  clear  picture  of  the  long-range 
correlations  of  the  video  traffic  stream.  Thus,  the  autocorrelation  functions  of  the  GOP 
sizes,  i.e.  the  sum  of  the  frame  sizes  of  one  GOP,  are  considered. 
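The frame level and GOP level autocorrelation functions can be estimated from a frame size trace as in the following sketch. The function names are hypothetical, and the biased sample estimator normalised by the lag-0 value is an assumption, since the estimator used for the figures is not specified.

```python
from statistics import fmean


def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    mean = fmean(series)
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    n = len(dev)
    return [sum(dev[i] * dev[i + k] for i in range(n - k)) / c0
            for k in range(max_lag + 1)]


def gop_sizes(frame_sizes, gop_length=12):
    """Sum of the frame sizes of each complete GOP."""
    n_gops = len(frame_sizes) // gop_length
    return [sum(frame_sizes[g * gop_length:(g + 1) * gop_length])
            for g in range(n_gops)]


# Toy trace with a fixed IBBPBBPBBPBB pattern (made-up sizes in bits).
trace = [100, 20, 20, 50, 20, 20, 50, 20, 20, 50, 20, 20] * 50
acf = autocorrelation(trace, 24)
print([round(r, 2) for r in acf[:13]])
```

For such a strictly periodic toy trace, the lags that are multiples of the GOP length give the largest peaks, the lags that are multiples of three give smaller positive peaks, and the remaining lags are negative, which mirrors the shape described for Figure 15. For a real trace, the GOP level function is obtained by calling autocorrelation on the output of gop_sizes.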

Figures 16, 17, and 18 show the autocorrelation functions of the GOP sizes of the sequences
dino, soccer, and starwars. In addition, the dashed line shows the exponential
function, which is matched to the empirical autocorrelation function of the first few lags.
A curve of this type appears if the GOP size process is memoryless. If the autocorrelation
function of the statistical data is above the exponential function, this indicates dependences
in the GOP size process. In Figures 16 and 18 this is clearly the case, whereas
the autocorrelation curve and the exponential curve match well in Figure 17.
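One possible way to match an exponential function to the first few lags is sketched below. The fitting method is not stated in the text, so the least squares fit of the logarithm through the origin, the default of five lags, and the function names are all assumptions made for illustration.

```python
import math


def fit_exponential_rate(acf, n_lags=5):
    """Fit acf[k] ~ exp(-beta * k) to lags 1..n_lags.

    Least squares fit of ln acf[k] = -beta * k through the origin,
    using only lags with a positive autocorrelation value.
    """
    lags = [k for k in range(1, n_lags + 1) if acf[k] > 0.0]
    numerator = sum(k * math.log(acf[k]) for k in lags)
    denominator = sum(k * k for k in lags)
    return -numerator / denominator


def excess_over_exponential(acf, beta):
    """Empirical autocorrelation minus the fitted exponential, per lag."""
    return [value - math.exp(-beta * k) for k, value in enumerate(acf)]


# Synthetic, slowly decaying autocorrelation function (made-up shape).
slow = [(1 + k) ** -0.3 for k in range(50)]
beta = fit_exponential_rate(slow)
print(beta, excess_over_exponential(slow, beta)[40])
```

A clearly positive excess at large lags corresponds to the situation described for Figures 16 and 18, whereas an excess close to zero corresponds to Figure 17.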

This result makes it difficult to find a GOP layer model which is appropriate for all types of
video  sequences.  On  the  other  hand,  it  is  often  sufficient  to  have  a  model  which  is  accurate 
in  terms  of  correlations  in  the  order  of  frames,  i.e.  tens  of  milliseconds.  Therefore,  it 
is  possible  to  neglect  the  GOP-by-GOP  correlations  and  to  use  only  distributions  and 
moments  of  the  GOP  sizes  to  model  the  GOP  size  process,  e.g.  with  a  Markov  chain, 
an  autoregressive  process,  or  simply  drawing  GOP  size  samples  based  on  the  GOP  size 
histogram. 
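The simplest of these options, drawing independent GOP size samples based on the GOP size histogram, might look like the following sketch. The function names, the bin width, and the uniform placement of a sample within its bin are assumptions made for illustration.

```python
import random
from collections import Counter


def build_histogram(gop_sizes, bin_width):
    """Return (bin indices, counts) of the GOP size histogram."""
    counts = Counter(int(size // bin_width) for size in gop_sizes)
    bins = sorted(counts)
    return bins, [counts[b] for b in bins]


def sample_gop_sizes(bins, counts, bin_width, n, rng):
    """Draw n independent GOP sizes; uniform within the chosen bin."""
    chosen = rng.choices(bins, weights=counts, k=n)
    return [(b + rng.random()) * bin_width for b in chosen]


# Made-up GOP sizes in bits.
measured = [150_000, 152_000, 160_000, 171_000, 158_000, 240_000]
bins, counts = build_histogram(measured, bin_width=10_000)
print(sample_gop_sizes(bins, counts, 10_000, 5, random.Random(1)))
```

By construction, such a generator reproduces the marginal distribution but no GOP-by-GOP correlations.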

Another way to detect long-range dependences is to use variance-time plots, R/S plots,
or periodograms [1, 8]. Here, we focus on the R/S plots, because R/S analysis is a robust method
to determine the asymptotic Hurst exponent H of long time series. An introduction to
R/S analysis can be found in [9].

Sequence      Hurst exponent H
race          0.99
soccer        0.91
lambs         0.89
terminator    0.89
mtv           0.89
simpsons      0.89
talk1         0.89
dino          0.88
atp           0.88
mr.bean       0.85
asterix       0.81
news          0.79
starwars      0.74
talk2         0.73
settop        0.53

Table 3: Hurst exponents of the encoded sequences

Figures 19, 20, and 21 show the R/S plots, strictly speaking the pox plots, of the frame
size sequences of dino, soccer, and starwars. The slope of the street of points in the
diagrams is an estimate of the Hurst exponent H. The slope is computed using a least
squares fit, where the first row and the last two rows of R/S values are not considered.
The first row may reflect too many short-range dependence effects, and the number of
R/S values in the last rows is too small.
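The estimation procedure can be sketched in Python as follows. This is an illustration under stated assumptions and not the code used for Table 3: the function names are hypothetical, the blocks are taken as non-overlapping, and the block lengths (powers of two) are chosen freely. As described above, the first row and the last two rows of R/S values are left out of the least squares fit.

```python
import math
import random
from statistics import fmean, pstdev


def rescaled_range(block):
    """R/S statistic: range of the cumulative deviations from the block
    mean, divided by the block standard deviation."""
    mean = fmean(block)
    std = pstdev(block)
    if std == 0.0:
        return None  # R/S is undefined for a constant block
    cumulative = lowest = highest = 0.0
    for value in block:
        cumulative += value - mean
        lowest = min(lowest, cumulative)
        highest = max(highest, cumulative)
    return (highest - lowest) / std


def pox_rows(series, block_lengths):
    """One row of (log10 d, log10 R/S) points per block length d."""
    rows = []
    for d in block_lengths:
        row = []
        for start in range(0, len(series) - d + 1, d):
            rs = rescaled_range(series[start:start + d])
            if rs is not None and rs > 0.0:
                row.append((math.log10(d), math.log10(rs)))
        rows.append(row)
    return rows


def hurst_estimate(series, block_lengths):
    """Least squares slope over all rows except the first and last two."""
    rows = pox_rows(series, block_lengths)[1:-2]
    points = [p for row in rows for p in row]
    mean_x = fmean(x for x, _ in points)
    mean_y = fmean(y for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Independent Gaussian samples as a stand-in for a trace without
# long-range dependence (synthetic data, not a video trace).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
print(hurst_estimate(noise, [2 ** k for k in range(3, 12)]))
```

For short blocks the R/S estimate of independent data tends to lie somewhat above 0.5, which is one reason to discard the first row.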

The estimated parameter H for the dino sequence is 0.88, for the soccer sequence it is
0.91, and for the starwars sequence it is 0.74. Time series without any long-range
dependences have a Hurst exponent of 0.5, whereas time series of computer traffic can
have H-values up to 1.0 [4]. It is interesting to notice that the soccer sequence has a large
H-value, but that the autocorrelation function of the GOPs is decaying exponentially.

It is assumed that in case of video traffic a larger H-value reflects a larger amount
of movement in the video sequence [1]. This is corroborated by Table 3 for most of
the encoded sequences. Only the H-values of talk1 and starwars do not agree with this
assumption. In the case of starwars the H-value is low compared to the other movies.
However, except for the settop sequence, all sequences have H-values of 0.73 or
higher, and the existence of long-range dependences can be assumed.

If  the  model  of  the  video  traffic  should  have  long-range  dependence  properties,  a  class 
of  processes  called  fractional  differencing  processes  may  be  used  [4].  These  processes 
generate  time  series  with  given  H-values,  but  it  may  be  difficult  to  match  a  given  marginal 
distribution  for  the  generated  samples. 
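A fractional differencing process can be sketched with a truncated moving average representation. The sketch below assumes a fractional ARIMA(0, d, 0) process with d = H - 1/2 and Gaussian innovations. The function names and the truncation length memory are hypothetical choices, and the truncation itself is an approximation.

```python
import random


def fd_weights(d, n):
    """First n moving average weights of (1 - B)^(-d)."""
    weights = [1.0]
    for k in range(1, n):
        weights.append(weights[-1] * (k - 1 + d) / k)
    return weights


def fractional_noise(hurst, n, rng, memory=200):
    """Approximate fractionally differenced Gaussian series of length n."""
    d = hurst - 0.5
    weights = fd_weights(d, memory)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + memory)]
    return [sum(weights[j] * eps[t + memory - j] for j in range(memory))
            for t in range(n)]


series = fractional_noise(0.9, 500, random.Random(42))
print(len(series), series[:3])
```

The samples generated in this way are Gaussian, which illustrates the difficulty mentioned above: the Hurst exponent is controlled directly, but a Gamma or Lognormal marginal distribution requires an additional transformation that is not shown here.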
3.2 Layered modeling scheme


In  this  section,  we  are  going  to  present  a  layered  modeling  scheme  for  the  development 
of MPEG video traffic models^2.

The main information for the model development which we obtain from the MPEG way
of coding can be summarized as follows:


•  There  are  three  frame  types:  I,  P,  and  B  frames. 

•  A pattern of frame types, called GOP, is repeated continuously to create the encoded
frame sequence.

•  The  frames  of  one  single  GOP  strongly  depend  on  each  other. 


Moreover,  if  we  want  to  create  a  model  on  cell  level,  both  the  AAL  which  is  used  for  the 
transmission  of  the  video  and  the  information,  whether  the  cell  stream  is  shaped  before 
it  enters  the  network  or  not,  should  be  taken  into  account. 

Based  on  the  information  presented  up  to  this  point,  we  are  already  able  to  develop  a 
scheme with three layers (cf. Figure 2):


•  GOP  layer, 

•  Frame layer,

•  Cell  layer. 


At  the  moment,  higher  layers,  like  scenes,  are  not  under  consideration  for  two  reasons. 
First, each additional layer adds some complexity to the model and we want to have
simple  models.  Second,  in  most  cases  the  time  scale  of  one  GOP,  i.e.  about  half  a 
second,  is  large  enough  in  the  ATM  context. 

Having decided on the layers, we have to define the statistical properties of each layer
and  of  the  way  the  layers  interact. 

Based  on  the  results  of  Section  3.1,  we  are  able  to  select  a  stochastic  process  for  each 
layer,  which  is  appropriate  for  our  purposes  or  analysis  technique,  respectively. 

After  this  step,  we  have  to  lay  down  the  way  the  layers  depend  on  each  other. 

For example, if we want to generate a frame size sequence based on the GOP size process,
we have to consider the structure of the GOP pattern, which tells us the order of the
types of frames. The simplest way to find the frame sizes based on a GOP size sample
is to use a scaling factor for each frame of the GOP, where the scaling factors are the
mean sizes of the frames of one GOP divided by the mean GOP size of a given data set.
More complex models may use frame size histograms or approximate pdf’s to generate
the frame size sequence (cf. Figure 2).
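The scaling factor approach can be written down in a few lines. The sketch below follows the description above; the function names are hypothetical and the toy trace consists of made-up numbers.

```python
def scaling_factors(frame_sizes, gop_length=12):
    """Mean size of each frame position in the GOP divided by the mean
    GOP size of the data set."""
    n_gops = len(frame_sizes) // gop_length
    position_means = [
        sum(frame_sizes[g * gop_length + p] for g in range(n_gops)) / n_gops
        for p in range(gop_length)
    ]
    mean_gop_size = sum(position_means)
    return [m / mean_gop_size for m in position_means]


def frames_from_gop(gop_size, factors):
    """Split one GOP size sample into frame sizes."""
    return [gop_size * f for f in factors]


# Toy data set with a six-frame GOP (IBBPBB), sizes in bits.
trace = [100, 20, 20, 50, 20, 20, 120, 30, 30, 60, 30, 30]
factors = scaling_factors(trace, gop_length=6)
print(frames_from_gop(300, factors))
```

Since the factors sum to one, the generated frame sizes of a GOP always add up to the GOP size sample, but all frames of one position scale identically, which is the main simplification of this approach.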

^2 An overview of the video modeling literature can be found in [10].
Figure 2: Layered video traffic modeling scheme




If  we  want  to  obtain  a  cell  level  model,  we  have  to  make  up  our  mind  on  the  way  the 
frames are broken into cells. This will depend on the considered ATM Adaptation Layer
(AAL)  and  on  the  existence  of  shaping  facilities  between  video  source  and  ATM  network. 
If  a  statistical  analysis  of  video  cell  stream  measurements  is  available,  it  will  be  possible 
to  base  models  directly  on  this  material.  This  may  lead  to  simpler  models  for  the  cell 
process. 
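As one possible instance of the cell layer, the following sketch assumes that every frame is carried as a single AAL5 PDU, i.e. an 8 byte trailer is appended and the PDU is padded to a multiple of the 48 byte cell payload. The choice of AAL5, the frame duration of 40 ms, the back-to-back cell time of 2.7 microseconds, and all function names are assumptions made for illustration; the text does not prescribe them.

```python
import math

CELL_PAYLOAD_BYTES = 48  # payload of one ATM cell
AAL5_TRAILER_BYTES = 8   # AAL5 trailer appended to every PDU


def cells_per_frame(frame_bits):
    """Number of cells needed for one frame sent as one AAL5 PDU."""
    frame_bytes = math.ceil(frame_bits / 8)
    return math.ceil((frame_bytes + AAL5_TRAILER_BYTES) / CELL_PAYLOAD_BYTES)


def cell_times(frame_bits, frame_start, frame_duration=0.04,
               shaped=True, cell_time=2.7e-6):
    """Emission times of the cells of one frame.

    shaped=True spreads the cells evenly over the frame duration,
    shaped=False sends them back to back, one every cell_time seconds.
    """
    n_cells = cells_per_frame(frame_bits)
    spacing = frame_duration / n_cells if shaped else cell_time
    return [frame_start + i * spacing for i in range(n_cells)]


print(cells_per_frame(13078))
print(cell_times(13078, 0.0)[:3])
print(cell_times(13078, 0.0, shaped=False)[:3])
```

The two modes of cell_times correspond to the presence or absence of a shaping facility between video source and network; the resulting cell processes differ strongly in burstiness although the frame size process is the same.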

The  presented  model  development  scheme  is  not  a  recipe  to  get  a  perfect  video  traffic 
model.  It  is  more  like  an  outline  of  a  variety  of  stochastic  modules  and  the  description 
how  they  interact  in  the  case  of  video  traffic.  The  model  developer  will  have  to  choose 
the  modules,  which  are  appropriate  for  his  analysis. 

We want to point out that any model should be validated. All models, even complex
ones, are based on simplifying assumptions, like independence assumptions. Thus, to
obtain  useful  and  reliable  performance  analysis  results,  it  is  important  to  know  how 
these  assumptions  affect  the  results  of  the  analysis. 


4  Conclusions 

Modeling  of  VBR  video  traffic  is  often  difficult,  because  of  the  statistical  complexity  of 
the  empirical  data  sets,  for  example  their  layered  structure  and  the  correlations  on  several 
time  scales. 

In  this  paper,  we  present  a  detailed  statistical  analysis  of  new  MPEG  sequences,  which 
we  encoded  at  our  institute.  Each  sequence  consists  of  40000  frames.  We  were  able  to 
corroborate  several  results,  which  are  known  from  the  analysis  of  other  video  sequences: 
1. the frame and GOP sizes can be approximated by Gamma or Lognormal pdf’s, 2.
there  are  long-range  dependences  in  the  frame  sequences,  which  lead  to  Hurst  exponents 
from  0.7  up  to  about  1.0. 

The  new  data  sets  are  also  compared  to  the  well  known  Star  Wars  data  set  from  Mark 
Garrett.  It  can  be  concluded  that  with  respect  to  the  statistical  properties  the  Star  Wars 
sequence  is  a  good  representative  of  the  class  of  MPEG  video  traffic,  but  that  it  will  be 
misleading  to  dimension  ATM  networks  based  only  on  this  data  set.  There  are  sequences 
like  TV  broadcasts  of  sports  events,  where  performance  problems  like  buffer  overflows 
are  more  likely  than  with  the  Star  Wars  sequence. 

Based  on  the  statistical  analysis,  a  layered  modeling  scheme  for  MPEG  video  traffic 
is  presented.  We  describe  the  properties  of  each  layer  and  the  way  they  interact.  In 
addition,  some  guidance  is  given  on  how  to  develop  video  models. 

Figure 3: Frame size trace of the dino sequence

Figure 5: Frame size trace of the starwars sequence